
add a metric to observe worker active time #87

Open
zyguan wants to merge 1 commit into tikv:master from zyguan:dev/observability-3


@zyguan zyguan commented Feb 11, 2026

Adds a new Prometheus metric yatp_worker_active_seconds_total to observe how much time worker threads spend being active.

Summary by CodeRabbit

  • New Features

    • Added a metric that tracks the total time, in seconds, that worker threads spend active (not parked), labeled by thread pool name.
  • Tests

    • Added test for worker activity metric functionality.
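To make "active (not parked)" concrete, here is a minimal sketch of accounting that excludes parked intervals. It is not the PR's code: `ActiveClock`, `on_park`, and `on_unpark` are hypothetical names, and timestamps are passed in as offsets from an arbitrary origin so the logic can be checked without a real clock.

```rust
use std::time::Duration;

/// Hypothetical accumulator: sums active spans and skips parked intervals.
#[derive(Default)]
struct ActiveClock {
    /// Start of the current active span, or None while parked.
    active_since: Option<Duration>,
    /// Total active time recorded so far.
    total: Duration,
}

impl ActiveClock {
    /// Worker starts or wakes up: open a new active span if none is open.
    fn on_unpark(&mut self, now: Duration) {
        if self.active_since.is_none() {
            self.active_since = Some(now);
        }
    }

    /// Worker is about to sleep: close the current active span, if any.
    fn on_park(&mut self, now: Duration) {
        if let Some(since) = self.active_since.take() {
            self.total += now.saturating_sub(since);
        }
    }
}
```

With this shape, a worker that is active from 0s to 2s, parked until 10s, and active again until 13s accumulates 5 seconds rather than 13.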

Signed-off-by: zyguan <zhongyangguan@gmail.com>

coderabbitai bot commented Feb 11, 2026

Walkthrough

These changes introduce worker activity tracking to the thread pool via a new WORKER_ACTIVE_SECONDS Prometheus metric. The metric is defined in metrics.rs, wired into the pool builder, and tracked through a WorkerActivity struct that monitors worker lifecycle events (startup, task completion, shutdown) with periodic checkpointing to flush activity data.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Metric Definition & Initialization: src/metrics.rs, src/pool/builder.rs | Added WORKER_ACTIVE_SECONDS CounterVec metric; pool builder retrieves the metric and enables worker activity tracking by calling enable_worker_activity() on each local queue. |
| Worker Activity Tracking: src/pool/spawn.rs | Introduced WorkerActivity struct for per-worker activity tracking; added activity field to Local<T>; implemented lifecycle methods (enable_worker_activity, on_worker_start, on_worker_end, on_task_complete) with periodic checkpointing and elapsed time accounting. |
| Worker Lifecycle Integration: src/pool/worker.rs | Integrated lifecycle hooks into worker execution flow: calls on_worker_start() after runner initialization, on_task_complete() after each task, and on_worker_end() after draining all futures. |
| Test Coverage: src/pool/tests.rs | Added test_worker_active_seconds_metric to verify metric collection; creates single-thread pool, executes task with brief sleep, and asserts metric counter exceeds zero. |
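The periodic checkpointing described for WorkerActivity could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in src/pool/spawn.rs: `ActiveSeconds` is a hypothetical stand-in for the Prometheus counter, the one-second `CHECKPOINT_INTERVAL` is an assumed value (the thread does not state the real one), and the hooks take explicit timestamps so the behaviour is deterministic.

```rust
use std::time::Duration;

/// Hypothetical stand-in for a Prometheus counter of active seconds.
#[derive(Default)]
struct ActiveSeconds(f64);

impl ActiveSeconds {
    fn inc_by(&mut self, secs: f64) {
        self.0 += secs;
    }
}

/// Assumed flush interval; the PR's actual value is not shown in this thread.
const CHECKPOINT_INTERVAL: Duration = Duration::from_secs(1);

/// Sketch of per-worker tracking. Timestamps are offsets from an arbitrary
/// origin, supplied by the caller instead of read from a real clock.
struct WorkerActivity {
    /// Start of the not-yet-flushed active span, or None when inactive.
    active_since: Option<Duration>,
}

impl WorkerActivity {
    fn new() -> Self {
        Self { active_since: None }
    }

    fn on_worker_start(&mut self, now: Duration) {
        self.active_since = Some(now);
    }

    /// Flush only once the unflushed span reaches the checkpoint interval,
    /// so a busy worker does not touch the counter after every task.
    fn on_task_complete(&mut self, now: Duration, counter: &mut ActiveSeconds) {
        if let Some(since) = self.active_since {
            if now.saturating_sub(since) >= CHECKPOINT_INTERVAL {
                self.flush(now, counter);
                // Still active: start a new span at the checkpoint.
                self.active_since = Some(now);
            }
        }
    }

    /// Final flush on shutdown; leaves the tracker inactive.
    fn on_worker_end(&mut self, now: Duration, counter: &mut ActiveSeconds) {
        self.flush(now, counter);
    }

    fn flush(&mut self, now: Duration, counter: &mut ActiveSeconds) {
        if let Some(since) = self.active_since.take() {
            counter.inc_by(now.saturating_sub(since).as_secs_f64());
        }
    }
}
```

The trade-off in this sketch is between freshness and overhead: a long-running busy worker reports progress at least once per interval, while short tasks cost only a timestamp comparison.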

Sequence Diagram

sequenceDiagram
    participant Pool as Thread Pool
    participant Worker as Worker Thread
    participant Local as Local<T>
    participant Activity as WorkerActivity
    participant Metric as Prometheus Counter

    Pool->>Worker: Spawn worker thread
    Worker->>Local: Initialize with activity field
    Worker->>Local: on_worker_start()
    Local->>Activity: Record start time
    Activity->>Metric: Begin tracking (if enabled)

    loop For Each Task
        Worker->>Local: Execute task
        Local->>Activity: Monitor execution
        Worker->>Local: on_task_complete()
        Local->>Activity: Check elapsed time
        alt Periodic checkpoint triggered
            Activity->>Activity: Flush accumulated time
            Activity->>Metric: Update counter with active seconds
        end
    end

    Worker->>Local: Drain remaining futures
    Worker->>Local: on_worker_end()
    Local->>Activity: Final flush
    Activity->>Metric: Commit final active seconds
    Local->>Local: Clear activity state
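
The call order in the diagram can be sketched as a simplified worker loop. The hook names match the walkthrough, but the `LifecycleHooks` trait, the `Recorder` type, and `run_worker` are hypothetical and exist only to make the ordering observable; the real loop in src/pool/worker.rs also handles parking and draining futures.

```rust
/// Hypothetical hook surface mirroring the names used in the walkthrough.
trait LifecycleHooks {
    fn on_worker_start(&mut self);
    fn on_task_complete(&mut self);
    fn on_worker_end(&mut self);
}

/// Records each hook call so the ordering can be inspected afterwards.
#[derive(Default)]
struct Recorder {
    events: Vec<&'static str>,
}

impl LifecycleHooks for Recorder {
    fn on_worker_start(&mut self) {
        self.events.push("on_worker_start");
    }
    fn on_task_complete(&mut self) {
        self.events.push("on_task_complete");
    }
    fn on_worker_end(&mut self) {
        self.events.push("on_worker_end");
    }
}

/// Simplified worker loop: start, run every queued task, then shut down.
fn run_worker<H: LifecycleHooks>(hooks: &mut H, tasks: Vec<Box<dyn FnOnce()>>) {
    hooks.on_worker_start();
    for task in tasks {
        task();
        hooks.on_task_complete();
    }
    // In the real pool, remaining futures are drained before this call.
    hooks.on_worker_end();
}
```

Running two no-op tasks through `run_worker` with a `Recorder` yields one start event, two task-complete events, and one end event, in that order.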

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Hops of joy through the threads we go,
Counting seconds as workers flow,
Active time tracked with metrics so true,
Each checkpoint a hop, each flush renewed! 🎉

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding a metric to track worker active time, which aligns with the introduction of WORKER_ACTIVE_SECONDS metric throughout the codebase. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/pool/spawn.rs (1)

215-220: Minor: unrelated housekeeping change.

The #[allow(dead_code)] additions for the assertion traits are reasonable (they're compile-time-only constraints), though unrelated to the metric feature. Consider splitting into a separate commit for cleaner history.



Copilot AI left a comment


Pull request overview

Adds a new Prometheus counter metric to track how long worker threads are “active” (i.e., not parked), enabling better visibility into thread-pool utilization.

Changes:

  • Introduces yatp_worker_active_seconds_total (WORKER_ACTIVE_SECONDS) labeled by pool name.
  • Tracks worker active time via per-worker local counters and flushes on park/end (with periodic checkpoints while busy).
  • Adds a unit test validating the metric is incremented.
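
Because the metric is a cumulative counter, utilization has to be derived from the difference between two readings. The sketch below shows that arithmetic; `pool_utilization` is a hypothetical helper, and it assumes the per-pool counter sums the active time of all workers in the pool, which this thread implies but does not state outright.

```rust
/// Average fraction of time a pool's workers were active between two readings.
///
/// `prev` and `curr` are two readings of the cumulative active-seconds
/// counter, `wall_secs` is the wall-clock time between them, and `workers`
/// is the pool size. Returns None when the inputs cannot yield a ratio.
fn pool_utilization(prev: f64, curr: f64, wall_secs: f64, workers: usize) -> Option<f64> {
    if wall_secs <= 0.0 || workers == 0 || curr < prev {
        // curr < prev would indicate a counter reset, e.g. a process restart.
        return None;
    }
    Some((curr - prev) / (wall_secs * workers as f64))
}
```

For example, if the counter grows from 10 to 40 over 15 seconds in a pool of 4 workers, the workers were active for 30 of the 60 available worker-seconds, a utilization of 0.5.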

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
src/metrics.rs Defines the new WORKER_ACTIVE_SECONDS CounterVec metric.
src/pool/builder.rs Wires a per-pool labeled counter into each worker via LocalCounter.
src/pool/spawn.rs Implements WorkerActivity tracking and hooks into park/unpark and worker lifecycle.
src/pool/worker.rs Calls lifecycle hooks (on_worker_start, on_task_complete, on_worker_end) from the worker loop.
src/pool/tests.rs Adds a test ensuring the active-seconds metric increases after running work.


@cfzjywxk cfzjywxk requested review from cfzjywxk and you06 February 14, 2026 02:00

@cfzjywxk cfzjywxk left a comment


LGTM

3 participants